No machine learning knowledge required! Trying out the face recognition SDK sample from Japan Computer Vision.

せーの

2021.03.17


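This is せーの.

Lately, more and more shops have cameras that recognize faces to check for a fever or to tell whether someone is wearing a mask. At IT events, it is also becoming common to register a face photo in advance and use face recognition for identity verification.

This time I tried the face recognition SDK from Japan Computer Vision Corp. (JCV), which packages that face recognition machinery into an SDK that is easy to build into your own systems and services.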

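Features

When it comes to face recognition, AWS offers Amazon Rekognition (Image and Video), but this JCV SDK runs entirely offline except for the initial license activation.
So as long as a camera is attached, it can run on a smartphone, tablet, or single-board computer with nothing more than the SDK and an image database, which opens up a wide range of uses.

In addition, every function is exposed as an API. Beyond simply detecting faces and comparing one face with another, you can combine functions such as tracking, liveness detection (telling a photo from a real person), and feature extraction to build a solution that fits your own service.

As an engineer, I find that pretty exciting.
So let's get straight to the sample.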

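Trying it out

This time I tried the sample that detects a face in each of two images and compares their features to decide whether they show the same person. I used the Java version. The SDK itself does not depend on a particular Java version, so I used Java 11.

In his younger days

First, let's compare photos of our president, Yokota.
In his younger days.

And today.

After that, you just run the sample "Detect_OneVSOne", passing the configuration file (INI format), the two images, and the orientation of the person in the images (0 means up, that is, an upright face) as arguments.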

java Detect_OneVSOne initial.ini Satoshi_young.jpg Satoshi_adult.jpg 0

load license ok
score : 0.940
the same person
test finish!

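The score is 0.940, so they match. That was just a warm-up.

Next, let's see how far back in time we can go.

First, I cropped the face from my current driver's license into an image.

Then I compared it with the face on my license from 9 years ago.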

score : 0.939
the same person

They match. Next, 12 years ago.

score : 0.924
the same person

A match. 15 years ago.

score : 0.940
the same person

18 years ago.

score : 0.927
the same person

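Every one was judged to be the same person. I don't have any older front-facing photos, so I'll stop here.

Comparing various images

Next, let's test how much of the face can be hidden and still be recognized.

Here is the original image.

Glasses

Let's start with something easy.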

score : 0.966
the same person

A match, as expected.

Funny face

Let's change the face using nothing but my own muscles.

score : 0.939
the same person

No problem at all.

Sunglasses

Let's go ahead and hide the eyes.

score : 0.947
the same person

Oh, that's impressive.

Mask

Everyone is wearing a mask these days, after all.

score : 0.908
the same person

It handles masks too. Amazing.

Mask & sunglasses

Let's combine the two.

detect no face

For the first time, no face was detected. It seems to struggle when too many facial features are hidden.

Hat (brimmed)

Let's cover the head.

score : 0.958
the same person

This one matched.

Hat (knit)

How about a knit cap?

score : 0.945
the same person

Another clean match.

Neck guard

The kind you often wear when playing sports.

detect no face

No face was detected this time. Perhaps the outline of the face plays a large role in detection. In that case...

Edited with Snow

Let's heavily edit the face outline.

score : 0.468
different person

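It was judged to be a different person. As expected, it gets difficult once both the outline and the eyes differ.

Cross-dressing with an app

Here too, the eyes and outline are heavily edited.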

score : 0.470
different person

Again, it was judged to be a different person.

Kylo Ren from Star Wars

I went all the way and put on a mask.

detect no face

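Well, of course.

Conclusion

After trying all of this, it appears that the face outline and the eyes are what matter most for face recognition. Even so, when those features were partly hidden by sunglasses or a mask, the SDK still identified the person from the remaining features, so the accuracy felt more than adequate for building into a system.

Reading the sample code

Next, let's read through the sample code.

The code is very simple because it just strings API calls together.
First, add the face recognition SDK library, the OpenCV library (3.2.0) for image processing, and GSON (2.3.1) for JSON handling to the project (the screenshot shows Eclipse).

After that, set up the license file and the SDK is ready to use.

Now let's look at the code.

Specifying the face orientation

The face orientation is specified up front.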

int iamge_orientation = FaceproLibrary.STID_ORIENTATION_UP;
if (args.length == 4) {
  int orientation_index = Integer.parseInt(args[3]);
  if (orientation_index >= 0 && orientation_index <= 3) {
    iamge_orientation = face_orientation_list[orientation_index];
  } else {
    iamge_orientation = FaceproLibrary.STID_ORIENTATION_UP;
  }
}

The orientation is chosen from up, right, left, and down, with up as the default. An upright, front-facing face apparently counts as up.
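The fallback logic above can be condensed into one helper. The following is a minimal sketch, not the SDK's code: the `Orientation` enum is a hypothetical stand-in for `face_orientation_list` and the `STID_ORIENTATION_*` constants, and only "index 0 means up" comes from the article. The rest of the ordering is an assumption. Unlike the sample, this sketch also falls back to up when the argument is not a number.

```java
import java.util.List;

/** Sketch: map a command-line orientation index to an orientation, defaulting to UP. */
final class OrientationArg {
    // Hypothetical stand-in for the SDK's STID_ORIENTATION_* constants.
    // Only "index 0 = up" comes from the article; the remaining order is assumed.
    enum Orientation { UP, LEFT, DOWN, RIGHT }

    private static final List<Orientation> ORIENTATIONS = List.of(Orientation.values());

    /** Returns the orientation for args[3], or UP if it is missing, non-numeric, or out of range. */
    static Orientation parse(String[] args) {
        if (args.length != 4) {
            return Orientation.UP;
        }
        try {
            int index = Integer.parseInt(args[3]);
            return (index >= 0 && index < ORIENTATIONS.size())
                    ? ORIENTATIONS.get(index)
                    : Orientation.UP;
        } catch (NumberFormatException e) {
            // The original sample would throw here; this sketch falls back instead.
            return Orientation.UP;
        }
    }

    public static void main(String[] args) {
        System.out.println(parse(new String[] {"initial.ini", "a.jpg", "b.jpg", "0"}));
        System.out.println(parse(new String[] {"initial.ini", "a.jpg", "b.jpg", "9"}));
    }
}
```

Keeping the default in one place means a new orientation value only has to be added to the list.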

Setting the approximate face size and how often landmarks are detected

Set the expected size of the faces in the image.

// Read the detector configuration from the [others] section; default to large faces.
int detect_config = FaceproLibrary.STID_FACEPRO_DETECTOR_CONFIG_LARGE_FACE;
String detect_config_str = mIniReader.getValue("others", "detector_config");
// Compare strings with equals(): == compares references, not contents.
if ("small".equals(detect_config_str)) {
  detect_config = FaceproLibrary.STID_FACEPRO_DETECTOR_CONFIG_SMALL_FACE;
} else if ("any".equals(detect_config_str)) {
  detect_config = FaceproLibrary.STID_FACEPRO_DETECTOR_CONFIG_ANY_FACE;
}

int alignment_count = 0;
String alignment_count_str = mIniReader.getValue("others", "alignment_count");
if(alignment_count_str == null){
  alignment_count = 1;
}
else{
  alignment_count = Integer.parseInt(alignment_count_str);
  if(alignment_count < 0 || alignment_count > 0xff) {
    alignment_count = 1;
  }
}

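This sets the size of the faces expected in the image and how often the landmark scan used for face detection is run. In this sample, these settings are collected in the INI file and read from there.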
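To make the "read from INI, fall back to a default" pattern concrete, here is a minimal sketch. The section and key names ("others", "detector_config", "alignment_count") come from the sample, but the `IniSketch` class, the `DetectorConfig` enum, and the example file contents are assumptions for illustration. The sample's actual `mIniReader` implementation is not shown in the article.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of an INI lookup with defaults; not the sample's actual reader. */
final class IniSketch {
    // Hypothetical stand-in for the STID_FACEPRO_DETECTOR_CONFIG_* constants.
    enum DetectorConfig { LARGE_FACE, SMALL_FACE, ANY_FACE }

    private final Map<String, Map<String, String>> sections = new HashMap<>();

    /** Parses "[section]" headers and "key = value" lines; ';' and '#' start comments. */
    IniSketch(String text) {
        String current = "";
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith(";") || line.startsWith("#")) {
                continue;
            }
            if (line.startsWith("[") && line.endsWith("]")) {
                current = line.substring(1, line.length() - 1).trim();
                continue;
            }
            int eq = line.indexOf('=');
            if (eq > 0) {
                sections.computeIfAbsent(current, k -> new HashMap<>())
                        .put(line.substring(0, eq).trim(), line.substring(eq + 1).trim());
            }
        }
    }

    /** Returns the value, or null when the section or key is missing. */
    String getValue(String section, String key) {
        Map<String, String> values = sections.get(section);
        return values == null ? null : values.get(key);
    }

    /** Unknown or missing values fall back to LARGE_FACE, as in the sample. */
    static DetectorConfig detectorConfig(String value) {
        if ("small".equals(value)) {
            return DetectorConfig.SMALL_FACE;
        }
        if ("any".equals(value)) {
            return DetectorConfig.ANY_FACE;
        }
        return DetectorConfig.LARGE_FACE;
    }

    /** Values outside 0..0xff (or non-numeric ones) fall back to 1. */
    static int alignmentCount(String value) {
        if (value == null) {
            return 1;
        }
        try {
            int count = Integer.parseInt(value.trim());
            return (count < 0 || count > 0xff) ? 1 : count;
        } catch (NumberFormatException e) {
            return 1;
        }
    }

    public static void main(String[] args) {
        IniSketch ini = new IniSketch("[others]\ndetector_config = any\nalignment_count = 3\n");
        System.out.println(detectorConfig(ini.getValue("others", "detector_config")));
        System.out.println(alignmentCount(ini.getValue("others", "alignment_count")));
    }
}
```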

Loading the models

Load the machine learning models.

String alignment_model_path = mIniReader.getValue("resource", "alignment_model_filepath");
mDetectorHandle = mLibrary.createDetector(alignment_model_path, detect_config | alignment_count);
if (mDetectorHandle.getResultCode() != FaceproLibrary.STID_OK) {
  System.out.println("create detect handle error: " + mDetectorHandle.getResultCode());
  break;
}

String verify_model_path = mIniReader.getValue("resource", "verify_model_filepath");
mVerifyHandle = mLibrary.featureExtractionCreateHandle(verify_model_path);
if (mVerifyHandle.getResultCode() != FaceproLibrary.STID_OK) {
  System.out.println("create verify handle error: " + mVerifyHandle.getResultCode());
  break;
}

String compare_model_path = mIniReader.getValue("resource", "compare_model_filepath");
mCompareHandle = mLibrary.featureComparisonCreateHandle(compare_model_path);
if (mCompareHandle.getResultCode() != FaceproLibrary.STID_OK) {
  System.out.println("create compare handle error: " + mCompareHandle.getResultCode());
  break;
}

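The code loads the paths to three model files: alignment_model (face detection), verify_model (extracts facial features), and compare_model (compares facial features). However, verify_model and compare_model point to the same model, so in practice the solution is built from two models.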
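One detail worth a closer look is `detect_config | alignment_count` in the `createDetector` call. Because `alignment_count` is restricted to 0..0xff beforehand, one plausible reading is that the low byte carries the alignment count and the higher bits carry the face-size flag. This is an inference from the excerpt, not documented behavior, and the flag values below are made up for illustration.

```java
/** Sketch of packing a config flag and a small counter into one int. */
final class DetectorFlags {
    // Hypothetical flag values; the real STID_FACEPRO_DETECTOR_CONFIG_* values
    // are not shown in the article.
    static final int CONFIG_LARGE_FACE = 0x0100;
    static final int CONFIG_SMALL_FACE = 0x0200;

    // Assumption: the low 8 bits are reserved for the alignment count.
    static final int ALIGNMENT_MASK = 0xff;

    /** Combines a flag and a count, rejecting inputs that would overlap. */
    static int pack(int detectConfig, int alignmentCount) {
        if ((detectConfig & ALIGNMENT_MASK) != 0) {
            throw new IllegalArgumentException("config flag overlaps the count bits");
        }
        if (alignmentCount < 0 || alignmentCount > ALIGNMENT_MASK) {
            throw new IllegalArgumentException("alignment count must be 0..255");
        }
        return detectConfig | alignmentCount;
    }

    /** Extracts the count from the low byte. */
    static int alignmentCount(int packed) {
        return packed & ALIGNMENT_MASK;
    }

    /** Extracts the flag bits above the low byte. */
    static int detectConfig(int packed) {
        return packed & ~ALIGNMENT_MASK;
    }

    public static void main(String[] args) {
        int packed = pack(CONFIG_SMALL_FACE, 5);
        System.out.printf("packed=0x%04x count=%d%n", packed, alignmentCount(packed));
    }
}
```

Under this reading, the 0xff range check in the sample is what keeps the count from spilling into the flag bits.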

Reading and converting the face images

Read the face images with OpenCV and convert them into the SDK's own object.

Mat image_buff1 = null;
try {
  image_buff1 = Imgcodecs.imread(args[1]);
} catch (Exception e) {
  e.printStackTrace();
}

if(image_buff1 == null) {
  System.out.printf("load image %s fails.", args[1]);
  break;
}
StidImage image1 = ImageConvert.imageCvToStid(image_buff1);
lap = System.currentTimeMillis();

Mat image_buff2 = null;
try {
  image_buff2 = Imgcodecs.imread(args[2]);
} catch (Exception e) {
  e.printStackTrace();
}

if(image_buff2 == null) {
  System.out.printf("load image %s fails.", args[2]);
  break;
}
StidImage image2 = ImageConvert.imageCvToStid(image_buff2);

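This SDK-specific StidImage object holds what the OpenCV Mat contains (the image bytes, width, height, number of channels, and step1, the row stride counted in single-channel elements) plus the pixel format of the image.
Here the images are converted with the BGR888 pixel format specified.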
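To illustrate what a BGR888 buffer looks like, here is a small sketch that uses only the Java standard library. `RawImage` is a hypothetical stand-in for StidImage (whose real fields are not shown in the article), and `BufferedImage.TYPE_3BYTE_BGR` is used in place of an OpenCV Mat because it stores pixels in the same blue, green, red byte order.

```java
import java.awt.image.BufferedImage;
import java.awt.image.DataBufferByte;

/** Sketch of a BGR888 image container; RawImage is a hypothetical stand-in for StidImage. */
final class Bgr888Sketch {
    static final class RawImage {
        final byte[] data;   // interleaved pixel bytes: B, G, R, B, G, R, ...
        final int width;
        final int height;
        final int stride;    // bytes per row

        RawImage(byte[] data, int width, int height, int stride) {
            this.data = data;
            this.width = width;
            this.height = height;
            this.stride = stride;
        }

        /** Returns {blue, green, red} for the pixel at (x, y), each in 0..255. */
        int[] bgrAt(int x, int y) {
            int offset = y * stride + x * 3;
            return new int[] {
                data[offset] & 0xff, data[offset + 1] & 0xff, data[offset + 2] & 0xff
            };
        }
    }

    /** Wraps the backing bytes of a TYPE_3BYTE_BGR image without copying them. */
    static RawImage wrap(BufferedImage image) {
        if (image.getType() != BufferedImage.TYPE_3BYTE_BGR) {
            throw new IllegalArgumentException("expected TYPE_3BYTE_BGR");
        }
        byte[] data = ((DataBufferByte) image.getRaster().getDataBuffer()).getData();
        return new RawImage(data, image.getWidth(), image.getHeight(), image.getWidth() * 3);
    }

    public static void main(String[] args) {
        BufferedImage image = new BufferedImage(2, 1, BufferedImage.TYPE_3BYTE_BGR);
        image.setRGB(1, 0, 0x112233); // red=0x11, green=0x22, blue=0x33
        int[] bgr = wrap(image).bgrAt(1, 0);
        System.out.printf("B=%02x G=%02x R=%02x%n", bgr[0], bgr[1], bgr[2]);
    }
}
```

Each pixel takes three bytes, so the consumer needs the width, height, stride, and pixel format alongside the raw bytes to interpret them, which matches the kind of metadata described above.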

Detecting faces

Create a detection result object for each image and detect the face in it.

DetectorResults mImg1DetectorResults = mLibrary.detector(
    mDetectorHandle.getHandle(),
    image1,
    iamge_orientation);
if(mImg1DetectorResults.getResultCode() != FaceproLibrary.STID_OK) {
  System.out.println("detector error: " + mImg1DetectorResults.getResultCode());
  break;
}
if (mImg1DetectorResults.getDetectionResults() == null || mImg1DetectorResults.getDetectionResults().isEmpty() ||
  mImg1DetectorResults.getLandmarksResults() == null || mImg1DetectorResults.getLandmarksResults().isEmpty()) {
  System.out.println("img1 detect no face");
  break;
}

DetectorResults mImg2DetectorResults = mLibrary.detector(
    mDetectorHandle.getHandle(),
    image2,
    iamge_orientation);
if(mImg2DetectorResults.getResultCode() != FaceproLibrary.STID_OK) {
  System.out.println("detector error: " + mImg2DetectorResults.getResultCode());
  break;
}
if (mImg2DetectorResults.getDetectionResults() == null || mImg2DetectorResults.getDetectionResults().isEmpty() ||
  mImg2DetectorResults.getLandmarksResults() == null || mImg2DetectorResults.getLandmarksResults().isEmpty()) {
  System.out.println("img2 detect no face");
  break;
}

Landmarks img1Landmarks = mImg1DetectorResults.getLandmarksResults().get(0);
Landmarks img2Landmarks = mImg2DetectorResults.getLandmarksResults().get(0);

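Inference is run on the image data to obtain the detection results and the landmarks (key points on the face). The landmarks are then used to extract the features. If no face is detected at this point, the sample prints a message and stops.

Extracting the features

Next, extract the features from each image.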

StringResult img1VerifyResult = mLibrary.getFeature(
    mVerifyHandle.getHandle(),
    image1,
    img1Landmarks);
if(img1VerifyResult.getResultCode() != FaceproLibrary.STID_OK) {
  System.out.println("get img1 feature failed, err : " + img1VerifyResult.getResultCode());
  break;
}
StringResult img2VerifyResult = mLibrary.getFeature(
    mVerifyHandle.getHandle(),
    image2,
    img2Landmarks);
if(img2VerifyResult.getResultCode() != FaceproLibrary.STID_OK) {
  System.out.println("get img2 feature failed, err : " + img2VerifyResult.getResultCode());
  break;
}

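Again, this is just calling the SDK's APIs. Nothing so far has looked like machine learning code.

Comparing the features

Finally, compare the features from the two images and produce a score.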

FloatResult compareResult = null;
if (img1VerifyResult.getString() != null && img2VerifyResult.getString() != null) {
  compareResult = mLibrary.featureComparisonCompare(mCompareHandle.getHandle(), img1VerifyResult.getString(),
      img2VerifyResult.getString());
  if(compareResult.getResultCode() != FaceproLibrary.STID_OK) {
    System.out.println("compare two img failed, err : " + compareResult.getResultCode());
    break;
  }
  // compareResult was already dereferenced above, so no null check is needed here.
  String compartStr = String.format("%.3f", compareResult.getFloat());
  System.out.println("score : " + compartStr);
  lap = System.currentTimeMillis();
  System.out.println("compare: " + (lap - startTime) + "ms");
  if (compareResult.getFloat() > 0.5) {
    System.out.println("the same person");
  } else {
    System.out.println("different person");
  }
}

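Here the features of each face are passed as inputs to the feature comparison model. The result is read as a float, and the two faces are judged to be the same person if the score is greater than 0.5.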
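The decision rule can be replayed against the scores reported earlier in this article. The sketch below is self-contained and does not call the SDK. The `verdict` helper and the use of an empty `OptionalDouble` to represent "no face detected" are assumptions for illustration, while the 0.5 threshold and the scores are taken from the sample and the tests above.

```java
import java.util.OptionalDouble;

/** Sketch of the sample's decision rule, replayed on scores from this article. */
final class VerdictSketch {
    static final double THRESHOLD = 0.5; // the sample treats score > 0.5 as a match

    /** An empty score models the case where no face was detected. */
    static String verdict(OptionalDouble score) {
        if (!score.isPresent()) {
            return "detect no face";
        }
        return score.getAsDouble() > THRESHOLD ? "the same person" : "different person";
    }

    public static void main(String[] args) {
        String[] labels = {"glasses", "sunglasses", "mask", "mask & sunglasses", "Snow"};
        OptionalDouble[] scores = {
            OptionalDouble.of(0.966),
            OptionalDouble.of(0.947),
            OptionalDouble.of(0.908),
            OptionalDouble.empty(),   // detection failed, so there is no score
            OptionalDouble.of(0.468),
        };
        for (int i = 0; i < labels.length; i++) {
            System.out.println(labels[i] + ": " + verdict(scores[i]));
        }
    }
}
```

Note that the comparison is strict, so a score of exactly 0.5 counts as a different person. In a real deployment the threshold would likely be tuned to the acceptable false-accept rate rather than left at the sample's value.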

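Summary

That was a hands-on look at the sample for JCV's face recognition SDK. I think it is remarkable that this can be done with no machine learning knowledge, using almost nothing but API calls.
I ran the tests without giving it much thought, but some of the older images were blurry or had extremely low resolution, and inference still worked normally, which impressed me. JCV also appears to offer other functions such as liveness detection and face tracking, so I'd like to try those another time.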